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Abstract. Traditional methods for covariate adjustment of treatment 
means in designed experiments are inherently conditional on the ob- 
served covariate values. In order to develop a coherent general method- 
ology for analysis of covariance, we propose a multivariate variance 
components model for the joint distribution of the response and co- 
variates. It is shown that, if the design is orthogonal with respect to 
(random) blocking factors, then appropriate adjustments to treatment 
means can be made using the univariate variance components model 
obtained by conditioning on the observed covariate values. However, 
it is revealed that some widely used models are incorrectly specified, 
leading to biased estimates and incorrect standard errors. The approach 
clarifies some issues that have been the source of ongoing confusion in 
the statistics literature. 

Key words and phrases: Adjusted mean, blocking factor, conditional 
model, orthogonal design, randomized blocks design. 



1. INTRODUCTION 

This article concerns the adjustment of treatment 
means in designed experiments to account for one or 
more covariates. Analysis of covariance has a long 
history, dating back to Fisher (1934). Much of its 
development in the context of designed experiments 
followed soon after. Examples can be found in many 
classical textbooks, including Federer (1955), 
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Cochran and Cox (1957), Snedecor and Cochran 
(1967), as well as more recent texts such as Milliken 
and Johnson (2002). We are specifically interested 
in settings in which the covariates are random vari- 
ables, not fixed by the design. Also, we generally 
suppose that the covariate values are not affected 
by the treatments, for example, because they were 
measured prior to application of the treatments. The 
multivariate mixed model we propose can be modi- 
fied to handle problems in which this assumption is 
not valid. However, there are additional inferential 
issues in this case. 

As a canonical example we discuss in detail the 
randomized complete blocks (RCB) design with a 
single covariate. In the classical analysis of this de- 
sign the blocks are treated as fixed, and the covariate 
is included predictor. The least squares treat- 
ment means obtained from the fixed blocks model 
fit adjust the arithmetic treatment means to ac- 
count for differences in the average covariate mea- 
surements among the treatments. Including the co- 
variate block means as an additional predictor has 
no effect on the least squares fit because differences 
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at the block level are already accounted for. Treating 
blocks as fixed effectively confines the scope of infer- 
ence to only those blocks and covariate values in the 
study. In particular, the standard errors obtained 
from the fixed effects model for the least squares 
treatment means do not account for repeated sam- 
pling of blocks. 

However, in most applications the blocks in the 
study can be viewed as a random sample from a pop- 
ulation of interest, and it is of interest to extend the 
scope of inference to the population of blocks. An 
obvious modification of the classical model in this 
case is to simply treat the block effects as random; 
that is, to fit a (univariate) mixed model with two 
variance components. Under the standard indepen- 
dence and normality assumptions, adjusted treat- 
ment means obtained from the mixed model have 
the same form as the classical least squares means, 
except that the covariate regression coefficient is 
a weighted average of the estimates obtained from 
intra- and inter-block regressions. 

A key point made in this paper is that both uni- 
variate fixed and mixed models for analysis of co- 
variance are inherently conditional on the measured 
covariate values. An obvious question is, therefore, 
what joint distribution for response and covariate 
leads to the conditional models. We show that by 
simply treating the block effects as random in the 
randomized complete blocks design setting, one is 
implicitly assuming that the marginal block vari- 
ance for the covariate is zero; that is, an implied 
model for the joint distribution that is not realistic. 
As a result, the adjusted treatment means obtained 
from the naive, univariate mixed model are biased. 
However, by starting with a bivariate variance com- 
ponents model for the joint distribution of response 
and covariate, one is led to a sensible univariate con- 
ditional model, which properly accounts for the de- 
sign with respect to the covariate. The idea of a 
bivariate model is suggested in Cox and McCullagh 
[(1982), Section 7], but not fully developed. Multi- 
variate variance components models are discussed 
in Khuri, Mathew and Sinha [(1998), Chapter 10], 
but the application to analysis of covariance is not 
considered. The fully conditional approach was also 
advocated by Neuhaus and McCulloch (2006) who 
consider the situation where random effects in a 
generalized linear mixed model may be correlated 
with one of the predictors. Classical likelihood ap- 
proaches lead to inconsistent estimators in this set- 
ting. Results in Neuhaus and McCulloch (2006) 



show that conditional maximum likelihood can elim- 
inate the bias. 

An outline of the remainder of the paper is as fol- 
lows. In the next section we look back historically 
and attempt to explain why some fundamental is- 
sues in analysis of covariance are still unresolved. 
The randomized complete blocks design is discussed 
in detail in Section 3. In Section 4 we show how the 
bivariate model for the randomized complete blocks 
design can be generalized to orthogonal designs, and 
to allow adjustment for multiple covariates. In the 
orthogonal case the conditional model for the re- 
sponse given the covariates implied by the multivari- 
ate mixed model for the joint distribution turns out 
to be a univariate mixed model. This implies that 
appropriate adjustment of treatment means can be 
accomplished using standard software. The method- 
ology is applied to some standard examples of or- 
thogonal designs. The nonorthogonal case is discussed 
in Section 5. Here we show that appropriate adjust- 
ment cannot be accomplished by fitting a univari- 
ate mixed model. However, an EM algorithm for 
fitting a general multivariate linear variance com- 
ponents model is developed using arguments that 
parallel those for the univariate discussed 
in Searle, Casella and McCulloch (1992), Chapter 8. 
The methodology is applied to balanced incomplete 
block designs and unbalanced designs. We conclude 
with some discussion in Section 6. 

2. HISTORICAL PERSPECTIVE 

Since analysis of covariance in designed experi- 
ments has such a long history, readers may won- 
der why confusion over such basic modeling issues 
persists even today. We attempt to explain this by 
discussing the topic in its historical context. Ar- 
guably, the heyday of analysis of covariance was pre- 
1960, predating the widespread use of matrix alge- 
bra in statistics and clearly before the availability 
of high-speed computing. Matrices are two hundred 
and some years old but their use in statistics only be- 
came commonplace in the late 1950s. The very first 
paper in the first issue of the Annals of Mathematical 
Statistics by Wichsell (1930) is entitled "Remarks 
on Regression," yet it has no matrices. It is likely 
that the lateness of adoption of matrices arose from 
their being treated as a topic in pure mathematics 
and, hence, their practical use in statistical mod- 
eling remained hidden. In the classic design books 
Cochran and Cox (1957) and Federer (1955), there 
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is no mention of matrices, whereas in Kempthorne 
(1952) the design problem is formulated as a general 
linear model but is not applied to analyze any ad- 
vanced designs. Kempthorne [(1952), page 66] even 
notes that, at the time, there did not appear to 
be a complete matrix formulation of the general 
linear model anywhere in the literature. In unbal- 
anced data it was a longtime puzzle why statistical 
methods gave two different least squares estimates 
of fixed effects in a one-way classification depending 
upon whether one assumed that one effect was zero, 
or that all the effects summed to zero. The literature 
had certainly not kept up with R. C. Bose's (1949) 
concept of estimability. Nor were 1950s design and 
linear model researchers aware that Penrose's (1955) 
generalized inverse matrix could be used to solve 
the normal equations in the nonfull rank setting, as 
in design problems. With the aid of a generalized 
inverse, Rao (1962) demonstrated how the unique 
unbiased estimators of estimable functions could be 
constructed. In a recent Statistical Science conver- 
sation (Wells (2009)) Shayle Searle points out that 
random effects modeling in the 1950s was quite lim- 
ited and mostly for balanced data. One of the early 
formulations of matrix methods in variance and co- 
variance components analysis can be found in Searle 
(1956). 

An excellent paper by Zelen (1957) reviews the 
thinking at the time for balanced incomplete block 
(BIB) designs. The algebraic manipulations associ- 
ated with the multivariate model described in this 
paper would be very difficult, if not impossible, with- 
out the use of matrix algebra and facility with the 
multivariate normal distribution, mathematical tools 
that were not fully developed in the statistics liter- 
ature at the time of Zelen's paper. We focus on Ze- 
len's discussion of intra- and inter-block regressions, 
which we summarize with the following two quotes 
from Section 3 of his paper: 

in the analysis of covariance, the inter- 
block model will be important if the vari- 
ability of the concomitant variate is large 
for "between blocks" as compared to the 
variability "within blocks." This situation 
may allow more precise inter-block esti- 
mates of the regression coefficients as com- 
pared to the corresponding intra-block es- 
timates 

and later, when discussing the slopes of intra- and 
inter-block regressions, 



Some statisticians, however, have advo- 
cated a more general model which allows 
the intra-block regression coefficients to 
be different from the inter-block regres- 
sion. In this paper, all models are such 
that the intra-block regression is the same 
as that for the inter-block regression. It is 
difficult for this writer to visualize situa- 
tions allowing separate regressions. 

It turns out that the "[s]ome statisticians" Zelen 
referred to were right, but why? Clearly their argu- 
ments were not entirely convincing at the time. The 
bivariate variance components model described in 
Section 3.3 reveals that Zelen's two statements are 
incompatible. In fact, between block variation in the 
covariate implies that the intra- and inter-block re- 
gressions are not the same. Putting it another way, 
forcing the two slopes to be equal amounts to an as- 
sumption that there is no variation among the block 
covariate means. This assumption is clearly violated 
in practical circumstances and therefore leads to bi- 
ased (adjusted) treatment means as well as incon- 
sistent estimates of variance components. 

3. RANDOMIZED COMPLETE BLOCKS 
DESIGN 

3.1 Classical Approach 

Consider a randomized complete blocks design with 
response, Y, and associated concomitant variable, 
Z. Suppose that Z is measured prior to application 
of the treatment but is possibly correlated with the 
response. Let i = 1, . . . ,t be the index for treatment, 
and j = 1, . . . , b be the index for block. The classical 
fixed effects linear model for this design is as follows: 

(1) Yij = n + Ti + fy +~/Zij + Ey, 

where Eij ~ A r (0,dg), independently, and Yi T i = 
Y^j Pj = 0- Notice that replacing the covariate Zij 
with the block centered value Zij — z.j has no effect 
on the fit of this model because the term, —7%, 
can be incorporated into the fixed effect for block j. 
The adjusted mean for treatment i is the estimated 
mean response at a fixed value of Z, conventionally 
taken to be its average observed value, z... 

Throughout this paper Greek letters represent fixed 
effects (unknown parameters), upper case Roman 
letters are random variables or known matrices, and 
lower case Roman letters are either observed values 
of random variables or known constants (or vectors). 
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Table 1 

Adjusted treatment means and standard errors for Pearce's apple yield data 



Treatment 


Univariate fixed 


Univariate mixed 


Bivariate mixed 


Adj. Mean 


Std.Err. 


Adj. Mean 


Std.Err. 


Adj. Mean 


Std.Err. 


A 


280.48 


6.37 


280.41 


13.69 


280.48 


12.98 


B 


266.57 


6.36 


266.55 


13.68 


266.57 


12.98 


C 


274.07 


6.36 


274.05 


13.68 


274.07 


12.98 


D 


281.14 


6.44 


281.32 


13.72 


281.14 


13.02 


E 


300.92 


6.72 


301.33 


13.87 


300.92 


13.19 


S 


251.34 


6.86 


250.85 


13.95 


251.34 


13.28 



Adjusted means and their standard errors for the apple yield data from Pearce 
(1953, 1982), based on fixed effects, univariate mixed effects and bivariate mixed 
effects models. The standard errors involve the ML estimates of variance compo- 
nents. 



With these conventions it is implicit in the notation 
that model (1) characterizes the conditional distri- 
bution of the response given the observed values of 
the concomitant variable. 

We define the inter-block regression model to be 
the implied model for the block means, specifically, 

Thus, in this context, the inter-block model con- 
tains no information about treatment differences, 
or about the regression parameter, 7, because the 
terms, jz.j, are confounded with the block effects. 
We define the intra-block regression model using the 
t — 1 orthogonal Helmert contrasts, h.2,-.-,ht, be- 
tween components of the observation vector, Yj, for 
block j. Specifically, let Y*- = hjYj, for i = 2, . . . ,t 
and j = l,...,b, and similarly define z*- and E*-. 
Then the intra-block model in this case is 

Y* = t* + vz*- + E*- 

where r* = h?V, i = 2, . . . ,t, are t — 1 orthogonal 
contrasts among the treatment means. It follows 
that all the information about treatment differences, 
and the covariate regression parameter, is contained 
in the intra-block model. 

The adjusted treatment means are estimates of 
the mean responses when Z = z... These are given 
by 

(2) Ai.adj = A + n + *tz~ = v%- - - 

for i = 1, . . . ,t, where 7 is the BLUE for 7. These 
do not involve the (estimated) block effects because 
they are averages over the blocks and the block ef- 
fects sum to zero. In the fixed effects case, 7 is the 



ordinary least squares estimate 

- _ z r (C f ®C fe )y 
[6) lols ~ zT(C t ® C b )z ' 

where Ct = It — Jt is the centering matrix of di- 
mension t, and y = (yn,yu, ■ ■ ■ ,Utb) T is the entire 
response vector, with z defined analogously. Since 
C;,J;, = 0, it follows from (3) that 7 ; s is independent 
of the unadjusted treatment mean vector, (I t <S> Jb)y, 
with components, y~i., i = 1, ...,t. Hence, the vari- 
ance formula for the adjusted means based on the 
traditional model with fixed block effects is 
2 2 

(4) var(/i iiad j) = ^ + ^ 7 -^— — (zi. - z..) 2 . 

b z 1 (C t <g> Cb)z 

For numerical illustration we consider the apple 
yield data from Pearce (1953, 1982). In this experi- 
ment there were b = 4 blocks, and t = 6 treatments 
(A, B, C, D, E and S), with S being the standard 
practice in English apple orchards of keeping the 
land clean in the summer. The response, Y, is the 
yield per plot, and the covariate, Z, is the number 
of boxes of fruit, measured to the nearest tenth of a 
box, for the four seasons previous to the application 
of the treatments. The adjusted treatment means 
and their estimated standard errors, based on three 
different models, are reported in Table 1. 

3.2 Univariate Mixed Model 

In most applications the blocks can be regarded 
as a random sample from a population, and it is of 
interest to make inferences about the average treat- 
ment effects across the population of potential blocks. 
In such cases it makes sense to treat the block effects 
as random. Thus, model (1) becomes 

(5) Yij = n + Ti + Bj + jzij + Eij , 
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where now Bj ~ i.i.d. iV(0, a 2 ) independently of the 
random errors, E^j. Replacing the covariate z%j with 
the data centered value, Zij — z.., has no effect on the 
fit of this model because the term, —jz.., can be in- 
corporated into the fixed intercept. However, unlike 
the fixed block model (1), replacing the covariate 
with block centered values, — z.j, does affect the 
fit. 

The inter-block regression model derived from (5) 

is 

(6) Y.j = p + -/z.j + Bj + E.j , 

while the intra-block model is the same as in the 
fixed effects case. Thus, when the block effects are 
treated as random, they are incorporated into the 
error term of the inter-block model. As a result, the 
inter-block model does contain additional informa- 
tion about the covariate regression parameter. In 
particular, it would appear from (6) that the infor- 
mation in the inter-block model will increase with 
the variability of the covariate block means. This 
explains the first quote from Zelen (1957) given in 
Section 2 (albeit for a BIB design). 

The adjusted treatment means based on model 
(5) have the form (2), being the expected treat- 
ment means in repeated sampling (involving differ- 
ent blocks) at a common concomitant variable value, 
Z = z... However, the BLUE for 7 is a weighted av- 
erage of the estimates obtained from the intra- and 
inter-block regression models, where the weights are 
inversely proportional to their variances. Let p de- 
note correlation between any two sample treatment 
means, that is, 

2 

p = cor(Y l .,Y k .)= ° b 2/ . 

Then, it is shown in the Appendix that the BLUE 
of 7 based on (5) has the form 

_ z r [(I t - pj t ) ® C fe ]y 
7mix6d zT[(I t -pJ t )®C b ]z 

(7) 

_ z T {C t ® C b )y + (1 - p)z T (J t ® C b )y 
z T [(I t - pJ t )®C b ]z 

If the block variance dominates the error variance, 
and hence p ~ 1, then the mixed effects estimate 
of 7 is close to the ordinary least squares estimate 
in (3). On the other hand, if the block variance is 
dominated by the error variance, then p ~ 0, and 
the estimate in (7) corresponds to the fixed effect 



case in which the block effects are omitted from the 
model. 

The adjusted means and their standard errors based 
on the mixed effects model (5), for the apple yield 
data, are tabulated in Table 1. The ML variance 
component estimates in this example are a 2 = 194.55 
and a 2 = 553.98, resulting in a correlation estimate 
p = 0.9447, and 7 = 28.89. This compares with the 
ordinary least squares estimate 7 = 28.40. Thus, in 
this case the adjusted mean values are quite similar. 
However, the standard errors reported by the soft- 
ware are quite different. This is because inferences 
from the fixed effects model (1) are restricted to 
the four blocks in the study, whereas those from the 
model (5) apply to the population of blocks. Specif- 
ically, since (7) implies 7 m ixed is independent of the 
unadjusted treatment means, 

var(/i i)ad j) 

_ °l + °l 
b 

z T [(I t -pJ t )®C b }Vl(I t -pJ t )®C b }z 

(zT[(i t - P j t )®c b }zy 

■ (Zi. - z..) 2 , 

where £ = var(Y) = a 2 I t (g> I b + cr 2 3 t (g> I;,. Notice 
that, even as p approaches 1, this variance formula 
still differs from the fixed effects variance given in 

(4) by an additive amount, <J 2 /b, which accounts for 
variation due to sampling of blocks. 

3.3 Bivariate Mixed Model 

As noted in the Introduction, the models (1) and 

(5) are inherently conditional on the observed val- 
ues of the covariate Z. The fixed effects model (1) is 
appropriate if the blocks in the experiment are the 
only ones of interest, whereas model (5) is an at- 
tempt to broaden the applicability of inferences to 
the hypothetical population from which the blocks 
were drawn. An obvious question is what model(s) 
for the joint distribution of (Y, Z) leads to the con- 
ditional model (5)? 

Consider a bivariate model in which the distribu- 
tion of Z is independent of the treatments but allows 
for random variation between blocks and residual er- 
ror. Specifically, 
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where the block effects are i.i.d. bivariate normal 

r 2 



i.i.d. N 2 



a b,y a b,yz 
a b,zy o\ 



independently of the bivariate residual errors, 

2 



i.i.d. N 2 



cr, 



e,zy 



&e,yz 



As before, let Yj = (Y\j, . . . , Y t j) T denote the re- 
sponse vector for the jth block, and similarly define 
Zj. Then the conditionally specified model implied 
by bivariate model (8) can be formally derived using 
the fact that 



(9) 



i.i.d. N- 



21 



VzU 



vlyh + crl y Jt 



Ge,yz\ + &b,yzJ 



,yz J t 



°lz~lt + <J 2 b Jt 



where fi y is the vector of treatment means with com- 
ponents, fj,i >y = fiy + n t y. It follows that 

E(J j \Z j = z j ) 

= Vy + ( a e,yzlt + &b,yzJt)((Tl :Z It + Cfc^Jt)" 1 
■ (Zj - Ufa) 



(10) 



Vy + (Pe^zh + t(Tb,yzJ 



b,z 



b.z 



J t (Zj - l t fJ, 2 



= Hy+JeiZj - l t fj, z ) +J b l t (z.j ~ Hz), 

where 7 e = a eyz ja\ z and 



11] 



lb 



a e,z (J b,yz 



a e,yz a b,z 



^lz(^b,z+ a lz/t) ' 

The conditional variance is 
var(Y J |Z j = Zj ) 

- {&e,yzlt + Vb,yz3t){?l^t + J*)~ 



(12) 



Ce,2?yli + &b,zy^t) 



:,zy 



(&e,yz\t + &b,yz3t, 



• I 



- 5 b ' Z . o Jt ) (&e,zylt + &b,zyJtj 



'e,z t uu b,z 

= a\\t + of J f , 
where a\ = a\ y - o\y Z l<j\ z , and 

(13) o\ =O hy - YleOb.yz +Jb{o'b,yz + 0"e,j/z/*)]- 

It follows from (10) and (12) that the univariate 
conditional model implied by (9) is 

(14) Yij = n + Ti + Bj + -f e Zij + 7 6 % + Eij , 

where fi = \i v - {~f e + %)fi z , n = r i)2/ , and Bj ~ i.i.d. 
JV(0,of) independently of ~ i.i.d. JV(0,a*). The 
inter-block regression model implied by (14) is 



Kj- = + 76e%' + 5j + E.j, 



where 



Ibe = Je + Jb 



&b,yz &e,yz/t 
tf,z+<z/t 



is the slope of the inter-block regression. Thus, in 
this case the inter-block model contains no infor- 
mation about the intra-block covariate regression 
coefficient. Similarly, in the case of a generalized 
linear mixed model, where random effects may be 
correlated with one of the predictors, it is shown 
in Neuhaus and McCulloch (2006) that conditional 
maximum likelihood also leads naturally to the par- 
titioning of the covariate into between- and within- 
cluster components. 

Writing (13) in terms of the intra-block and inter- 
block slopes, 7 e and 7& e , we obtain 

°b = a b,y - le^e,yz/t ~ lbe{&b,yz + cr e ,yz/t). 

Thus, the block variance for the response in the con- 
ditional model is the marginal block variance ad- 
justed for intra- and inter-block regression on the 
covariate. 

The univariate mixed model (5) is the conditional 
model implied by (8) when 7^ = 0, which only hap- 
pens if a? = 0, an unrealistic assumption in prac- 
tice. At this point it is interesting to recall Zelen's 
(1957) comments, quoted earlier, concerning the 
equality of slopes in the inter- and intra-block mod- 
els, and the information in the inter-block model 
about the intra-block slope increasing with the block- 
to-block variability in the covariate. It is now clear 
that these statements are incompatible. Block-to- 
block variability in the covariate implies that the 
inter- and intra-block slopes are different. For this 
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reason the use of the univariate mixed model (5) 
leads to biased estimates of adjusted means and in- 
consistent estimates of variance components as the 
number of blocks increases. 

We define the adjusted treatment means to be the 
expected responses if the covariate values were all 
equal to the average observed covariate value. Thus, 
the model (14) implies 

A*i,adj = j" + T i + (7b + le)Z- 

(15) 

= lM,v + lbe[Z~ - fl z ). 

It is shown in the Appendix that fi z = z.., that ■% 
equals the ordinary least squares estimate based on 
univariate fixed effects model, and that fii^ y = — 
j e (z~i. — z..). It follows that the adjusted treatment 
means based on (14) are identical to (2). 

Estimates of the adjusted treatment means for the 
apple yield data based on (14) are given in Table 1. 
The estimate of the inter-block slope in this case is 
76e = 37.25. This is quite different in magnitude (al- 
though not statistically) from the estimated intra- 
block slope, 7 e = 28.40. Since the intra-block esti- 
mate is identical to those based on the univariate 
fixed effects model, the standard errors for the ad- 
justed means are given by 

/- \ a e+ a b a l - ^2 

Var(/ ^ } = — — + z^(C^C b )z ( ^ " *•> " 

The estimated standard errors are larger than those 
based on the fixed effects model by an additive factor 
of of /£>. This is as is should be, because the scope 
of inference has been broadened to the population 
of blocks. 

Up to now we have assumed that the covariate 
values are not affected by the treatments. If they 
are, then the bivariate model (8) is no longer appro- 
priate. An obvious modification of (8) in this case 
is 



Yi 



Zij 



j-u 



+ 



E, 



Ei 



My ) + f T i,y\ + > B 

JJ'z J \ T i,z J \ ±J 3,z/ \**lj,z. 

The conditional model for Y given Z implied by 
this model has exactly the same form as (14). How- 
ever, the treatment effect parameter Tj is equal to 
T i,y ~ le T iz- This makes sense in that what is be- 
ing estimated is the direct effect of treatments on 
the response mean, as opposed to the indirect effect 
through the covariate. As noted by Bartlett (1936), 
there is reason for caution in this setting due to hid- 
den extrapolation. Comparing conditional expecta- 
tions of treatment means at equal covariate levels 
may not make sense if the treatments affect what 
covariate values are observed. 



GENERAL ORTHOGONAL BLOCKING 
DESIGNS 



4.1 Theory 



Let Z 



(Ziji, . . . , Zij m ) T be a covariate vector 
associated with the response Yij, for i = 1, . . . , k in 
replicate j = l,...,b. Thus, the data matrix for repli- 
cate j is given by 



Y 



2/ 



7 T 

Z 2j 



say. Let Z* r denote the rth column of Zj. Suppose 
that Yj can be decomposed into Yj = fx y + Tj + Uj , 
where \x y is the fixed mean of Yj , which depends on 
the treatments, T,- is the sum random factors asso- 
ciated with treatments (and therefore independent 
of Zj), and Uj is the sum of q random design factors 
plus residual errors. 

We suppose that vec[Uj , Zj] , j = 1, . . . , b, are i.i.d. 
multivariate normal vectors of dimension k(m + 1), 
with means, vec[0fc, /xj<8> where the components 
of [i z are the marginal means of the m covariates, 
and with covariance matrix, V. We say that the de- 
sign is an "orthogonal blocking design" if the matrix 
V has the following structure. Let Ao = Jfc, and Aj, 
I = 1, . . . , q, be k x k matrices with the properties 
that (a) A; is idempotent, (b) A;A;/ = if I ^ I', 
and Yl^o A-l = Ifc- Then we suppose there exist non- 



singular matrices, Go, Gi, 
m + 1, such that 



, Go , each of dimension 



(16) 



Notice that a design can be orthogonal, in this sense, 
regardless of the assignment of the treatments. 

In general, the matrix V is a function of {q + 
1)2(771+ l)(m + 2) free variance and covariance pa- 
rameters which determine the q+ 1 variance-covariance 
matrices, So,...,S , associated with the residual 
errors, and the q random design factors. In particu- 
lar, we note that the variance-covariance structure 



for Uj is 



1=0 



where gi )U ui I = 0, . . . ,q, are scalar parameters, and 
that this structure is that implied by orthogonality 
of the random blocking factors. 
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Example. Consider the RCB design discussed 
in Section 3. In this case there is only one covariate, 
so m = 1. The vector U,- associated with the jth 
block consists of the sum of the block effect and and 
the residual error vector, 



l t Bj + Ej 



Finally, the covariance matrix for (Yj, Zj) T is given 
by 

V = 'E E ®I t + 'E B ® J t 

= (S B + ts B ) ® J* + s s ® (I - J t ). 

The fact that vec[Uy, Zj], j = l,...,b, are i.i.d. 
multivariate normal vectors implies that marginally 
vec(Zj), j = 1, ... ,b, are i.i.d. N([i z <g>lk,*V Z z), where 
u and z subscript combinations are used to denote 
components of the partitioned matrix. Moreover, con- 
ditionally upon Z, JJj, j = 1, have indepen- 
dent normal distributions with means 



E(Vj\Zj = Zj) 
= V M V" 1 (z j 



® lfe) 







z=o J 




q 




«=0 





(g) A; )(Zj -/i 2 



\z=o 
where 7^ : 



] z is a 1 x m parameter vector. 

Since (-yf ® Aj)(/i z ® l fc ) = 7 T /x 2 <g) A;l fc = 0, unless 
/ = 0, in which case it equals 7q , /z 2 1/ c , we have 



the covariates are equal to their respective marginal 
means is 

/^adj = /^ - !fc7o (z* -t*z), 

which generalizes the formula (15) for the RCB de- 
sign. 

The conditional variance of JJj is given by 
var(Uj \Zj = Zj)= V uu - V UZ V~}\ ZU 

q 

= ^2(9l,uu - gl,uzGj~J z gl,zu) ® A ; 



1=0 
Q 

1=0 



MM, 



E(Vj\Zj = Xj) = -7o At 2 lfc + ^2J2^ lrAlZ Jr- 

1=0 r=l 

Finally, since Tj has mean zero, and is independent 
of Zj, 

q m 

E(Yj\Zj = Zj) =K + EE > A '4' 

1=0 r=\ 

where fly = fi y — 7^/x 2 lfc. Thus, the conditional mean 
of the response is given by a linear model with treat- 
ment effects incorporated into fj, y , and covariate re- 
gression effects with slopes, {"fir}, I = 0, ■ ■ ■ ,q, asso- 
ciated with each of the m covariates, r = 1, . . . ,m. 
Since A^lfc = for I > 0, the expected response if all 



corresponding to an orthogonal design with orthog- 
onal partition {A/}. 

Example. Consider again the RCB design of 
Section 3. Note that the conditional mean (10) can 
be reexpressed in the form, 

E(Yj\Zj = Zj) = fj, y + ^ be 3 t Zj + 7 e (It - 3 t )zj, 

where fi y = fi y - ^beVz, and 7 be = 7 e + 76 . Moreover, 
the conditional variance (12) can be reexpressed as 

var(Y j |Z j = Zj) = {a 2 e + ta 2 )J t + a 2 {J t - J t ). 

4.2 Examples 

4.2.1 Split-plot designs. Consider a standard split- 
plot experiment with t whole-plot treatments, each 
replicated r times, and s split-plot treatments in 
each wholeplot. Let Yijh denote response to split- 
plot treatment k, in whole-plot j assigned to whole- 
plot treatment i. Similarly index the covariate val- 
ues Zijk- Then, the marginal models for the response 
and covariate are 

Yijk = fJ-y + OL^y + W^j >y + T k) y + a>Tik,y + Eijk^y 

and 



Zijk — l^z + W(i)j,z + % 



J ijk,z 



respectively. Bivariate normality for the pairs, {Wu\j !y , 
Wu\j >z ) and (E^j^y, Eijk. z ), imply that the condi- 
tional model for appropriate covariate adjustment 
has the form, 

Y ijk = H + a-i + W(i)j +r k + ar ik 

(17) 

^Jw^ij- ~\~ r JeZijk Eijfc, 

where on is the fixed main effect of the ith wholeplot 
treatment, Tf. is the fixed main effect of the fcth split- 
plot treatment, and Wu\j is the random effect of the 
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jth wholeplot replicate nested within the ith whole- 
plot treatment. Milliken and Johnson [(2002), Sec- 
tion 15.4] discuss a split-plot design in the context of 
a cookie baking experiment in which oven tempera- 
ture is the whole plot factor, and cookie type is the 
split-plot factor. The covariate in their example is 
the thickness of the slices of cookie dough, but their 
proposed "equal slopes" model does not include the 
whole plot regression term in (17). 

If the experiment is arranged in b blocks, with 
r replicate wholeplots for each wholeplot treatment 
level in each block, then the marginal model for the 
response is 

Yijkl = fJ>y + B i,y + OLj + {Ba)ij + W^ k) y 

+ n + {Bt)u + (aT)ji + (Bar)iji + E ijki,y 

Since the treatments have no effect on the covariate, 
the marginal model for Z is 

Zijki = Hz + B i)Z + W(ifik )Z + Eijki,z- 

Notice that, in this case, there are random inter- 
actions between the blocking factor and treatments 
that affect the response, but not the covariate. Bi- 
variate normality of the pairs, (5^, _Bj j2 ), (Waj^ y, 
W(ifi kjZ ) and (Eij k i sy , E ijk i yZ ), results in a conditional 
model for the response with a covariate adjustment 
at the individual response level, as well as adjust- 
ments for covariate variation in wholeplot and block 
means, specifically, 

Yijkl = H + Bi + OLj + (Ba)ij + W(i^ k 

+ n + (Bt) u + (ar)ji + (Bar)iji 

+ IbZi- + IwZijk- + leZijkl + Eij k i. 

4.2.2 Latin square design. Let Y^ k denote the re- 
sponse in cell of a latin square design involving 
two random blocking factors and a fixed treatment 
factor, each with b levels. An appropriate model ig- 
noring any covariate information is 

Y r ijk — Mj/ ^i,y ^j,y 1~k Eij k y. 

A marginal model for a random covariate is 

Zrijk — "I" Bi,z ~\~ Cj,z Eij k>z . 

Bivariate normality of the pairs, (Ri t y,Ri tZ ), (Cj t y, 
Cj.z) and (Eij kt y, Eij ktZ ), results in a conditional 
model for the response with a covariate adjustment 
at the individual response level, as well as adjust- 
ments for covariate variation in row and column 
means. That is, 

Y ijk = H + Ri+Cj +T k + J r Zi.. 

+ IcZ.j.. + JeZijk + E ijk . 



4.2.3 Incomplete block designs. Consider an incom- 
plete block design with k < t treatments appearing 
in each block. Let Y,- and Zj denote the response 
and covariate vectors for block j. The arguments 
of Section 3.3 lead to the conditional model (14), 
with the subscript i taking k values in {1, 2, . . . ,t} 
depending on the value of j, and with the block re- 
gression parameter, 75, having the same form as (11) 
with t replaced by k. We note here that even though 
this design is not orthogonal with respect to the re- 
sponse, it is orthogonal from the perspective of the 
covariate. It follows that the appropriate adjustment 
for covariates in an incomplete block design can be 
carried out using a univariate mixed model. 

As an example we consider data from a study con- 
ducted by the National Bureau of Standards, dis- 
cussed in Zelen [(1957), Section 6], to determine 
the effects of four geometrical shapes on the current 
noise of resistors. As described by Zelen, the "ge- 
ometrical shapes were rectangular parallelepipeds 
(all having the same thickness) formed by taking all 
four combinations of 2 widths (wi, W2) and 2 lengths 
(^1,^2)-" Three resistors were mounted on each of 12 
ceramic plates according to a BIB design. The re- 
sponse was the logarithm of the noise measurement, 
and the covariate was the logarithm of the resis- 
tance of each resistor. Estimated treatment effects 
and their standard errors obtained using the uni- 
variate mixed model of the form (5), and using the 
conditional model (14) derived from the bivariate 
model (8), are given in Table 2. There are substan- 
tial differences in both the estimated effects and the 
standard errors obtained using the two models. The 
estimates are also highly correlated, and these cor- 
relations must be taken into account in comparisons 
among the length and width combinations. In par- 
ticular, Zelen considered the interaction contrast, 

7r = (hw\ - I2W2) - {hwi - hw 2 ). 

Estimates of 7r under the two models (5) and (14) are 
0.022 and 0.018 respectively, with standard errors 
0.056 and 0.061. Thus, both models lead to the same 
conclusion that there is little statistical evidence for 
interaction. 

5. NONORTHOGONAL DESIGNS 

5.1 Factorization 

A key feature of the multivariate mixed model 
in the orthogonal design case is that the parame- 
ters in the conditional model for Y (fj, y and ~fi, 1 = 
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Table 2 

Estimated treatment effects and standard errors for Zelen's 

BIB 



Univariate Bivariate 



Treatment 


Effect 


Std.Err. 


Effect 


Std.Err. 


hwi 


-0.519 


0.112 


-0.449 


0.233 


I1W2 


-0.238 


0.029 


-0.229 


0.040 


I2W1 


0.249 


0.031 


0.238 


0.045 


I2W2 


0.508 


0.109 


0.440 


0.226 



Estimated treatment effects and standard errors obtained us- 
ing the mixed models (5) and (14). In each case the variance 
components were estimated REML which explains why the 
first set of estimates (labeled "univariate" ) differ slightly from 
those obtained by Zelen (1957). 



0, 1, . . . , q) are variation independent of those in the 
marginal model for Z (fx z and Gi )ZZ , I = 0, 1, ... ,q). 
The two sets of parameters combined represent a 
1-1 transformation of the bivariate model param- 
eterization (fiy, fj, z , G/, l = 0, 1, . . . , q). In general, 
this decomposition of the parameter space may not 
be possible, in which case appropriate adjustment of 
the treatment means cannot be accomplished using 
a univariate mixed model. To see this, suppose that 
the covariate data is only partially observed, say, 
z = (z Q ,z m ), where z G denotes the observed part, 
and z m the unobserved. Then, the joint distribution 
of the data is 

f(y,z ;n y ,n z ,G) 

= J /y|z(yl z ; V c y , j)fz(z; V z , G zz ) dz m , 

and, hence, the marginal distribution of z is 

J j fr\z(y\z; V L y ,~f)fz(z; Hz, G zz ) dz m dy. 

There is now no guarantee that the parameters that 
determine the marginal distribution of z D will be 
separable from those that determine the conditional 
distribution of y given z D . 

To further illustrate this point, consider again the 
RCB design discussed in Section 3. In this case the 
bivariate model has t + 7 parameters which deter- 
mine the marginal means and block and error co- 
variance matrices, (fj, y , /i z , The parame- 
terization in terms of the marginal model for Z, 
and the conditional model for Y, is a union of two 
variation independent components of dimensions 3 
and t + 4 respectively. Specifically, a% z , a\ z ) U 
(/2,7 e ,76,cjg,a"^), where fi has ith component equal 



to /x + Tj. Now suppose that only k < t covariate 
values are recorded in block j. Let z Jj0 denote the 
observed vector of covariate values (of length k) and 
let 

z.j )0 denote its mean value. Then, modifying the 
arguments that led to equation (10) results in the 
conditional mean 

E(Yj\Zj >0 = z j)Q ) = n y + 7e(zj iG - lkfJ-z) 

+ 7&,o 1 fc(%o - 

where "fb,o has the same functional form as (11) with 
t replaced by k. Thus, if the blocks have different 
numbers of covariate measurements, the parameters 
of the conditional model for Y are not separable 
from those of the marginal model for Z. 

5.2 General Multivariate Mixed Model 

To simplify the notation, we relabel the vector of 
responses as Zo (i.e., Zo = Y). Then Z = vec[Zo, Zi, 
. . . , Z m ] is a vector containing all the responses and 
associated values of m covariates stacked on top of 
one another. Thus, if the number of responses is n, 
then Z has length n x (m + 1). The multivariate 
mixed model described in this paper can be written 
in the form, 

r q 
Z = X/3 + ^C l T l + ^D i B i , 

i=l i=0 

where X determines the means structure, Tj ~ N(0, 
ofl c J, independently for i= l,...,r, are random 
factors associated with treatments, and Bj ~ iV(0, 
S« <g> Idi)> independently for i = 0, 1, . . . ,q, are ran- 
dom (blocking) factors associated with the design, 
with the exception of Bo, which is the residual error 
term (so that do = n). It is convenient to partition 
the matrices X, C, and Dj into blocks consisting 
of the n rows associated with the response, or one 
of the m covariates. Thus, X = [Xq , X^, . . . , X^] T , 
Q = [Cf , . . . , CU T and B, = [Bf , . . . , Bf m f . Note 
that, if the covariates are unaffected by the treat- 
ments, then Cij = for j > 0. The model implies 
that Z has variance-covariance matrix equal to 

r q 

V = var(Z) = C<C«7? + J^D^Ei ® I d jDf . 

8=1 8=0 

We define the adjusted response mean vector as 
its conditional expectation given the covariates eval- 
uated at their estimated mean values. If we partition 



COVARIANCE IN DESIGNED EXPERIMENTS 



11 



the variance matrix, V, into n x n matrix compo- This implies that 
nents, the conditional expectation of the response ,„, ,. r „ ,„ 

vector is given by |S| = |{ d I Ci ^ 2 }| ® I d J| |D (S ^ I„)D T 



£(Y|Z, = l n (i Zi ,i= 1, . . .,m) = ^ TJ^c* }> <j ]J |E, 

= X /3 + ^^([Vy]^)- 1 ^/*^ - l ftMa 





r } ( 1 



J 0| 



.i=i ; u=i 



Here, we have used the notational definitions in because D ° = W ® J «- The complete data log- 

Searle, Casella and McCulloch [(1992), Section 8.3] . llkellhood 18 therefore 

Thus, for example, [ r V 0i ] = [V i,..., V 0m ]. The es- 1 i 2 1^,, , v 

timate of the adjusted mean response vector is there- 2 C * ^ _ 2 ^ 

fore 4=1 4=0 

A adj = X /3 = XotX^Xr^V^Z. _ 1 £ TfTi _ 1 £ Bf ^ ^ ^ 

A "naive" variance-covariance formula for the ad- 4=1 4=0 

justed mean vector, ignoring variability due to the where 

estimation of V, is given by the conditional variance r q 

of Aadj assuming V is known. Specifically, B = Z - X/3 - ^ QT; - ^ DjBj 

var(/i adj ) = Xo(X T V- 1 X)- 1 X T [V VS V^]™ =0 1=1 

depends on the parameter 3. It follows that the 

• X(X V X) X , maximum likelihood estimates based on the com- 

where [V^] = V" 1 , and V* = V 00 - [ r V oi ] • plete data are 

avi^rHcVoj}. (18) & 2 = ± T T Th i=1> ... >rj 



c. 

1 



, i = 0,...,q, 

j,k=o 



5.3 EM Algorithm 

The distributional assumptions described above (19) ~£i 
imply that the "complete" data vector, 

(rjT rpT rpT R T T3 T '\ T &nd 

{/j , J-i , • • • , i r ,-Di , • • • ,r>g J , 

has a multivariate normal distribution with mean = X[X (So (8> I n ) X] X (So ( 

(/3 T X T , T ) T . The assumptions imply the covari- ^ > ( r i \ 

ance between Z and Tj and Bj are, respectively, ■ I Z — ^^C^Ti — ^^DjBj J . 

cov(Z, Ij ) = Cjfjj , 

and 



is 



The EM algorithm consists of iteratively replacing 

Tj and Bj in (20), and TfTj and B^.B ifc in (18) 

cov(Z, Bf ) = Dj(5]j (g) I^). and (19), by their conditional expectations given the 

observed data Z. These expectations are straightfor- 
Thus, the joint density of the complete data vector w&rd tQ calculate becauge the conditional distribu- 

tions involved are multivariate normal. Specifically, 

/(z,t,b) = [2^S|- 1 /2 exp (_Q/2), Tl |Z = z ~ NtfC? V-\z - X/3), 

= [(z-X/3) r ,t^,b r ]S- 1 [(z-X/3) r ,t r , a 2j .^cry-lei 

1 

r „ independently, for i = 1, . . . , r, and 

V {rCiCTf} 

{cCfaf} { d Icrf} E(B t \Z = z) = (Si 55 I*)Df V^z - X/3), 

{c (S^I di )Df} (F var(B i |Z = z) = E i 0l di 



{rDiCSi®^)} 





independently, for i = 1, . . . , q. 
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Table 3 

Adjusted means and standard errors for unbalanced apple 
yield data 





Covariate 


Response 


Response 


Std.Err. 


Treatment 


mean 


mean 


Adj. Mean 




A 


8.53 


283.67 


269.29 


13.35 


B 


8.40 


266.67 


255.69 


13.35 


C 


8.35 


275.25 


271.62 


12.73 


D 


7.93 


270.25 


277.47 


12.73 


E 


7.48 


277.25 


295.96 


12.73 


S 


9.30 


279.50 


251.63 


12.73 



Adjusted means and their standard errors for the apple yield 
data from Pearce (1953, 1982), with covariate and response 
data missing for treatments A and B in block 1. The stan- 
dard errors were computed using equation (17) and the ML 
estimates of variance components. 



5.4 An Unbalanced Example 

Consider the apple yield data from Pearce (1953) 
discussed in Section 2. Suppose that the observa- 
tions (both covariate and response) were missing 
for treatments A and B in block number 1. The 
adjusted means based on this unbalanced data are 
given in Table 3. The adjusted means are evalu- 
ated at the ML estimate of the covariate population 
mean, jl z = 8.2080, which is not the same as the 
overall mean covariate value, z = 8.3182, because of 
the imbalance with respect to treatments. The stan- 
dard errors for the adjusted means for treatments 
A and B are larger than for the other treatments 
because they are based on observations from three 
blocks rather than four. 

6. DISCUSSION 

The traditional methods for covariate adjustment 
of treatment means in designed experiments are in- 
herently conditional. In order to develop a coher- 
ent general methodology, we have proposed a multi- 
variate variance components model for the joint dis- 
tribution of the response and covariates. We have 
shown that, if the design is orthogonal with respect 
to blocking factors, then appropriate adjustments to 
treatment means can be made using the univariate 
variance components model obtained by condition- 
ing on the observed covariate values. As noted in 
Section 5, the key to this is the factorization for the 
joint distribution of (Y,Z), 

f(y,z;9) = f Y]z (y\ Z ;9 1 )f z ( Z ;9 2 ), 

where the conditional density fy\z defines a univari- 
ate linear mixed model for the response variable Y, 



and where 9 = (#i,02) and 9\ and 9% are variation 
independent. 

Our approach reveals the fact that some widely 
used models generate biased adjusted means and in- 
correct standard errors because the assumed condi- 
tional model imposes unrealistic constraints on the 
joint distribution. Our multivariate model also clar- 
ifies some issues that have been the source of long- 
standing confusion in the statistics literature. One 
such example is in the analysis of balanced incom- 
plete block designs. As noted by Zelen (1957), "With 
respect to the non-covariance situation, most statis- 
ticians agree that the inter-block analysis may be 
important if the number of blocks is 'large' or if 
the variability between blocks is 'small'." However, 
what is less understood is that the same statement is 
true in the analysis of covariance. The multivariate 
analysis makes this clear because it reveals that be- 
tween block variation in the covariate implies that 
the slope of the inter-block regression is different 
from that in the intra-block regression. 

In the multivariate model discussed in this paper, 
we assume that the effect of the covariates is the 
same for all treatments. It is common in the liter- 
ature for authors to consider models in which this 
is not the case. For example, one can easily mod- 
ify the univariate analysis of covariance model (5), 
for a randomized blocks experiment, to allow the 
slope of the covariate regression to depend on the 
treatment (see, for example, Milliken and Johnson 
(2002), Chapter 9). However, as we have shown, 
this univariate analysis is incorrect because it fails 
to account for the block regression with respect to 
the covariate. If the block regression components 
are included in the model, should these also de- 
pend on the treatments? It is the opinion of these 
authors that the correct univariate model for co- 
variate adjustment, if one exists, must be motivated 
by a multivariate model for the joint distribution 
of the response and the covariates. For example, a 
conditional model for the response in which the re- 
gression slopes depend on the treatments is implied 
by a multivariate model in which the error covari- 
ance structure is heterogeneous across treatments, 
but this model also implies that the conditional er- 
ror variances are heterogeneous across treatments, 
an assumption that is not typically made. In addi- 
tion, it seems unnatural to assume heterogeneity in 
the error covariance structure unless there is also 
heterogeneity in the block variance-covariance ma- 
trices. Thus, it is unclear to these authors if there is 
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a coherent univariate analysis that allows covariate 
effects to depend on the treatments. 

The ideas presented in this paper underscore the 
importance of proper model specification and care- 
ful parameter interpretation in regression analysis of 
blocked and clustered data. The formulation of the 
multivariate model guards against ad hoc formula- 
tion and misspecification of the regression model by 
omitting the block-level mean effects that may seri- 
ously bias the estimate of the individual- level effects. 

A number of articles have explored particular types 
of adjusting and centering for block and cluster means 
There are several reasons for adjusting for the block 
and cluster means. As noted by Berlin et al. (1999), 
variability in block and cluster means is common, 
and can confound the estimated association between 
the individual-level exposure measurement and out- 
come; adjusting for the cluster mean may remove 
confounding bias. Similarly, Neuhaus and Kalbfleisch 
(1998) argue that inference on the individual- level 
effects can be misleading without adjustment. Both 
Kreft, de Leeuw and Aiken (1995) and Raudenbush 
and Bryk (2002) articulate the need for evaluating 
block and cluster-level effects as predictor variables 
in their own right. The paper by Begg and Parides 
(2003) reviews different heuristic adjustment and 
centering approaches for the separation of individual- 
level and block/cluster- level effects on response and 
their appropriate interpretation. In this paper we 
suggest a multivariate model that automatically 
yields the best adjustment and centering suggested 
by Begg and Parides (2003). 

Throughout this article we assumed a joint normal 
multivariate model. It is well known (see Cambanis, 
Huang and Simons (1981)) that conditional moment 
calculations are robust with respect to the family of 
elliptically contoured distributions. That is, if two 
random vectors have a joint elliptically contoured 
distribution, then the conditional distribution of one 
given the other is also elliptically contoured. The lo- 
cation and scale parameters of the conditional dis- 
tribution do not depend upon auxiliary parameters 
of the joint distribution, and consequently, the con- 
ditional mean and covariance calculations which ap- 
ply in the normal case are valid in this more general 
elliptically contoured setting as well. 

In the Bayesian context Gelman (2005) presents a 
general hierarchical regression approach for AN OVA 
problems in which effects are structured into ex- 
changeable batches. In this sense, ANOVA is a spe- 
cial case of linear regression, but only if hierarchi- 



cal models are used. In fact, the batching of ef- 
fects in a hierarchical model has an exact counter- 
part in the rows of the analysis of the variance ta- 
ble. In the case where the batches are nonexchange- 
able Gelman (2005) recommends subtracting batch- 
level regression predictors, then additive effects for 
the factor levels in each batch could be modeled 
as exchangeable. The proposed multivariate vari- 
ance components model for the joint distribution 
of the response and covariates would be a better 
starting point for the hierarchical modeling in the 
case of nonexchangeable batches. Assigning prob- 
ability distributions for the treatment effects and 
variance components automatically leads to coher- 
ent Bayesian inferences for the analysis of the co- 
variance model. 

Our modeling strategy has assumed, as is tradi- 
tional in designed experiments, that the covariate 
values are not affected by the treatments, for exam- 
ple, because they were measured prior to application 
of the treatments. From a graphical models view- 
point, our model is B — > Y 4— Z 4— B, where B = 
(B y ,B z ). It is a diversion to try to frame the model 
in this article on a causal inference scaffold since the 
inferential goals are quite different. In this article we 
have outlined a coherent framework for the adjust- 
ment of treatment means in designed experiments 
that account for one or more covariates, whereas in 
causality one is trying to assess an intervention ef- 
fect (of 2 on F) in the presence of a background 
variable (B) (Cox and Wermuth (2004)). The com- 
monality of the two issues lies in the fact that in both 
one is trying to sort out a set of consistent condi- 
tional relations within a system of random variables. 
A goal in casual modeling is to address the overall 
regression coefficient of Y on Z where B has been 
decoupled from Z, that is, B and Z are nonadja- 
cent in the graph. A consequence of this decoupling 
is that 7b in (11) equals zero so that the partial and 
overall effect Z on Y coincide, in which case the 
conditional model implied by (8) reduces to the uni- 
variate mixed model (5). Separating the block effect 
from the covariate massively restricts the scope of 
possible models. By starting with a bona fide multi- 
variate model for the joint distribution of response, 
covariates and blocks, one is led to a sensible uni- 
variate conditional model, which properly accounts 
for the design with respect to the covariate. 

Finally, we note that the multivariate variance 
component model has interesting applications be- 
yond just analysis of covariance. For example, the 
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generalization of a paired i-test for a univariate re- 
sponse to multiple observations per subject is a mixed 
effects model with between and within subject error 
components. If the response is multivariate, then a 
multivariate variance components model allows the 
same generalization to repeated multivariate mea- 
surements. The use of multivatiate variance compo- 
nents models for repeated measures analysis is con- 
sidered in Khuri, Mathew and Sinha (1998), Chap- 
ter 10. 

APPENDIX: ML ESTIMATION BASED ON 
THE BIVARIATE MODEL FOR A RCB DESIGN 

The representation of the bivariate model given 
in (9) implies that the joint density of (y, z) can be 
factored, 

b 

/(y, z ) = Y[fY\z(yj\ z j)fz(zj)- 

Likelihood-based inference can equivalently be based 
on the joint density of a 1-1 transformation of the 
data vector. Specifically, let denote the Helmet 
matrix of dimension t, and consider the transfor- 
mation, (Yj-.Zj-) (Y*,Z*) = (I^Y^H^Zj), for 
j = 1, . . . ,b. The (i,j)th component of Y* is YJ* = 
h?Yj, where hj is the ith column of H. Similarly, 



i j 



hfZ r 



Now, using the facts that h^h,;/ = for i ^ i', 
hfh{ = 1, and hj 1 = 0, for i = 2, . . . , t, it is straight- 
forward to verify that the pairs, (Y*-,Z*-), i = 1, . . . , t 
and j = 1, . . . , b, are mutually independent. Further- 
more, 



v* 



i.i.d. iV? 



h, z 



and for each i = 2, . . . , t, 

£i j ~ i.i.d. iV 2 
where 9. 



j = l,...,b, 
j = !,...,&, 



that 



tf y — hj fi y and 9i iZ = h{ lfJ, z . It now follows 



i.i.d. N{9 ljZ ,a +ta b J , 



i 
J' = l 



, . . . , iv, 



Y *j\ Z lj = Z lj ~ LLd - N (0l,vz + IbeZijjObe), 



i.i.d. iV(0,a e 2 i2 ), 



2,...,t,j = l,...,b, 



and for i = 2, . . . , t, 



Z- ■ ~ 



i.i.d. N(9i !y + j e z*j,al), 



j = l,...,b, 

where j be = (a £j y Z + ta hiyz )/ (a\ z + iof j = j b + 7 e , 

01,1/3 = 6l,y-lbeQl,z, and of e = y + tu\ y - 7fc e (cTg,^ + 

From these distributional results we can easily 
deduce the ML estimates. In particular, 9\ tZ = z\. 
which implies jl z = z.., 



Ibe 



z T (J t 



C b )y 



z T (J t ®C 6 )z' 

i=2 2^j=l\ z ij ~ z i-)Vij 

J2i=2 J2j=i( z ij ~ z t) 2 

Ej=l Ej=l( z ij ~ z i- ~ % + z --)Vij 
J2i=l Ylj=l( z ij ~ ; 

z T (Q®C 6 )y 



Z.4 + Z.. 



CMz' 



Oi,yz = V*. -Jbe z t and 6 itV = y* - %z* ,i = 2,...,t. 
Note that 9i jV ^ y*. Also, the ML estimate of 7 e 
is identical to the OLS estimate of 7 based on the 
standard fixed effects model (1). Also, 9\ iV = y*. = 
hfy., but 9 ijV = y* - %z* = hf (y. - 7 e z.), for i = 
2, . . . , t. Hence, 



H6>, 



HH T y. - 7e HH T z.+ 7e lz.. 

7e(z. - If..), 



which agrees exactly with the adjusted mean for- 
mula (2) based on the fixed effects model (1). 

Finally, if there is no between block variation in 
the covariate (i.e., a bz = 0), then jj, = 0. In this case, 
7be and % are independent estimates of j e . The ML 
estimate of -y e in this case is a weighted average 
of the two independent estimates, with weights in- 
versely proportional to their conditional variances. 
Since 

var(Y|z) = (a% + of J) ® I b 

= (a 2 e +ta 2 b )[(l-p)I t + pJ t ], 

it follows that 



var(7 be |z) 
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and 



var( 7e |z) = (1 - p ) — 



C 6 )z 



This implies that 



7e,MZ 



z T [(J t + l/(l-p)C t )»C 6 ]y 



(1/ 



-V(i-P)c t ) 

pit) <8> C 6 ]y 



z T [(It-pJ t )®C 6 ]z' 
which agrees with (7). 
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